DeepMind's reinforcement learning framework: Acme
Acme provides a high-level API; a simple training script can look like the following:
```python
loop = acme.EnvironmentLoop(environment, agent)
loop.run()
```
Any environment that implements the DeepMind Environment API can be used in this training loop. Section 3 shows the details.
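Conceptually, `EnvironmentLoop.run()` alternates between the agent and the environment. The sketch below is a simplified, self-contained illustration using stand-in classes, not Acme's actual implementation:

```python
import random

class ToyEnv:
    """Stand-in environment: the episode ends after 5 steps."""
    def reset(self):
        self.t = 0
        return 0.0  # initial observation

    def step(self, action):
        self.t += 1
        done = self.t >= 5
        return float(self.t), 1.0, done  # observation, reward, done

class ToyAgent:
    """Stand-in agent with the actor-style methods the loop relies on."""
    def select_action(self, observation):
        return random.choice([0, 1])

    def observe(self, action, reward, next_observation):
        pass  # a real agent would write the transition to its replay buffer here

    def update(self):
        pass  # a real agent would take a learning step here

def run_episode(environment, agent):
    """Roughly what one iteration of the environment loop does."""
    observation = environment.reset()
    episode_return = 0.0
    done = False
    while not done:
        action = agent.select_action(observation)
        observation, reward, done = environment.step(action)
        agent.observe(action, reward, observation)
        agent.update()
        episode_return += reward
    return episode_return

print(run_episode(ToyEnv(), ToyAgent()))  # → 5.0
```

The real loop additionally handles episode bookkeeping and logging, but the act → step → observe → update cycle is the core of it.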
Although Acme is described as a framework for distributed reinforcement learning in the technical report, DeepMind has not published the distributed agents and, unfortunately, has no timetable for releasing them. (See the FAQ.)
Acme can be installed from PyPI with pip. The package name is “dm-acme”. There are several install options. The “reverb” option is almost always necessary for the replay buffer, and one of “tf” or “jax” is necessary as well.
The installation command is therefore one of:

```shell
pip install dm-acme[reverb,tf]
pip install dm-acme[reverb,jax]
```
Additionally, you can use the “env” option to install environments such as “dm-control” and “gym”.
The abstract class is defined here. One of the largest differences from gym.Env is that step() returns a dm_env.TimeStep instead of a plain Python tuple. The TimeStep class also tracks whether the step is the first (StepType.FIRST), a middle (StepType.MID), or the last (StepType.LAST) step in the trajectory.
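The shape of dm_env.TimeStep can be illustrated with a simplified, self-contained reimplementation (the real class lives in the dm_env package; this sketch only mirrors its structure and does not import it):

```python
import enum
from typing import Any, NamedTuple, Optional

class StepType(enum.IntEnum):
    """Marks a TimeStep's position within an episode (mirrors dm_env.StepType)."""
    FIRST = 0
    MID = 1
    LAST = 2

class TimeStep(NamedTuple):
    """Simplified stand-in for dm_env.TimeStep."""
    step_type: StepType
    reward: Optional[float]
    discount: Optional[float]
    observation: Any

# An environment's reset() yields a FIRST step, with no reward yet:
first = TimeStep(StepType.FIRST, reward=None, discount=None, observation=[0.0, 0.0])
# Subsequent step() calls yield MID steps, and the episode ends with LAST:
last = TimeStep(StepType.LAST, reward=1.0, discount=0.0, observation=[1.0, 0.0])

print(first.step_type == StepType.FIRST)  # → True
print(last.discount)  # → 0.0
```

Bundling step_type, reward, discount, and observation into one named structure is what lets agents detect episode boundaries without a separate `done` flag.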
For gym.Env, Acme provides a wrapper:
```python
import acme
import gym

env = acme.wrappers.GymWrapper(gym.make("MountainCarContinuous-v0"))
```
The list of agents is shown here. Currently (August 2020), 11 agents are provided.
- Continuous control
- Deep Deterministic Policy Gradient (DDPG)
- Distributed Distributional Deep Deterministic Policy Gradient (D4PG)
- Maximum a posteriori Policy Optimisation (MPO)
- Distributional Maximum a posteriori Policy Optimisation (DMPO)
- Discrete control
- Deep Q-Networks (DQN)
- Importance-Weighted Actor-Learner Architectures (IMPALA)
- Recurrent Replay Distributed DQN (R2D2)
- Batch RL
- Behavior Cloning (BC)
- Learning from demonstrations
- Deep Q-Learning from Demonstrations (DQfD)
- Recurrent Replay Distributed DQN from Demonstrations (R2D3)
- Model-based RL
- Monte-Carlo Tree Search (MCTS)
These agents extend acme.agents.agent.Agent. You can also create your own custom algorithms in a similar way.
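As a rough sketch of what implementing a custom agent involves: an agent pairs an actor, which selects actions and records transitions, with a learner that consumes them. The stand-in class below only mimics the general shape of that actor interface in plain Python (it does not import acme, and the exact method signatures may differ between Acme versions):

```python
import random

class RandomActor:
    """Stand-in actor illustrating the actor side of a custom agent."""

    def __init__(self, num_actions):
        self.num_actions = num_actions
        self.transitions = []  # stand-in for a replay buffer

    def select_action(self, observation):
        # A real actor would query its policy network here.
        return random.randrange(self.num_actions)

    def observe(self, action, reward, next_observation):
        # A real actor would forward this transition to the replay
        # buffer (e.g. Reverb) so the learner can sample from it.
        self.transitions.append((action, reward, next_observation))

    def update(self):
        # A real actor would pull fresh weights from the learner here.
        pass

actor = RandomActor(num_actions=3)
action = actor.select_action(observation=[0.0])
actor.observe(action, reward=0.0, next_observation=[0.1])
print(0 <= action < 3)  # → True
```

A class with this shape plugs directly into the `run_episode`-style loop shown earlier; the learner half would sit on the other side of the replay buffer.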